Our results fall into the three major areas described below.


I.  Breaking  Intractability 


Since  I  have  kept  AFOSR  well  informed  of  progress  in  this  area  I  will  not  repeat  myself  here. 


The  following  five  papers  and  reports  deal  with  progress  in  this  area. 


1.  A  Surprising  and  Important  New  Result.  Report  to  AFOSR  by  J.  F.  Traub, 
February  25, 1993. 

2.  Recent  Progress  in  Information-Based  Complexity.  J.  F.  Traub  and 
H.  Wozniakowski.  Invited  paper.  Bulletin  European  Association  for  Theoretical 
Computer  Science,  October  1993,  Number  51,  pages  141-154. 

3.  Breaking  Intractability.  J.  F.  Traub  and  H.  Wozniakowski.  Published  as  cover 
story  Scientific  American  January  1994. 

4.  Development  and  Testing  of  Software  for  Multivariate  Integration.  Report  to 
AFOSR  by  S.  Paskov  and  J.  F.  Traub,  January  4, 1994. 

5.  Tractability  and  Strong  Tractability  of  Linear  Multivariate  Problems. 
H.  Wozniakowski.  To  be  published  in  the  March  1994  issue  of  the  Journal  of 
Complexity 

• Item #1 is a report to AFOSR introducing the concept of strong tractability.

• Item #2 is an invited article which reviews recent progress in information-based
complexity.

•  Item  #3  is  an  invited  article  for  Scientific  American  which  reports  on  recent  progress 
in  breaking  intractability. 

•  Item  #4  is  a  report  to  AFOSR  on  the  status  of  development  and  testing  of  software 
for  multivariate  integration. 

• Item #5 is the first publication regarding strong tractability. It will appear in the
March 1994 issue of the Journal of Complexity.


II.  Monte  Carlo 

The  Monte  Carlo  Algorithm  With  A  Pseudorandom  Generator.  J.  F.  Traub  and 
H.  Wozniakowski.  Published  in  Mathematics  of  Computation,  January  1992,  Vol.  58,  pages 
323-339. 

The current method of choice for computing multivariate integrals is Monte Carlo. Of course, on
a computer there are no random numbers, only pseudorandom numbers. There is a huge
literature on statistical testing of pseudorandom numbers. However, these tests do not answer the
question of most interest to the user: are the good properties of the Monte Carlo algorithm using
random numbers preserved if pseudorandom numbers are used? In this paper, which we believe
to be the first on this topic, we prove that the answer is yes provided some care is taken. For
example, in d dimensions it is necessary to use d random seeds.
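
As a rough illustration of the setting (not the construction analysed in the paper), the sketch below estimates a d-dimensional integral over the unit cube by Monte Carlo, drawing each coordinate from its own separately seeded pseudorandom stream. The function names `mc_integrate` and `product_integrand`, the test integrand, and the seed values are assumptions made for this example; Python's built-in generator merely stands in for whatever generator is actually used.

```python
import random


def mc_integrate(f, d, n, seeds):
    """Monte Carlo estimate of the integral of f over [0, 1]^d.

    Each of the d coordinates is drawn from its own pseudorandom
    stream, and each stream gets its own seed (d seeds in total).
    """
    seeds = list(seeds)
    if len(seeds) != d:
        raise ValueError("need exactly one seed per dimension")
    streams = [random.Random(s) for s in seeds]
    total = 0.0
    for _ in range(n):
        x = [rng.random() for rng in streams]  # one sample point in [0, 1)^d
        total += f(x)
    return total / n


def product_integrand(x):
    """Hypothetical test integrand prod_j (2 * x_j); its exact integral is 1."""
    p = 1.0
    for xj in x:
        p *= 2.0 * xj
    return p


if __name__ == "__main__":
    d = 5
    estimate = mc_integrate(product_integrand, d, n=100_000, seeds=range(1, d + 1))
    print(f"d = {d}, estimate = {estimate:.4f}, exact value = 1")
```

The statistical error of such an estimate decays like the inverse square root of n regardless of d, which is the property the paper asks about when the samples are pseudorandom rather than random.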


III. Ill-Posed Problems

Linear Ill-Posed Problems Are Solvable On The Average For All Gaussian Measures.
J. F. Traub and A. G. Werschulz. To appear, Math Intelligencer, 1994.

It has been proven that ill-posed problems are unsolvable in the worst-case deterministic setting.
Yet ill-posed problems, which occur in many applications, must often be solved.

An answer may be provided in this paper. We show that ill-posed problems are solvable on the
average for every Gaussian measure. This is the first paper on the average case analysis of
ill-posed problems.


A SURPRISING AND IMPORTANT NEW RESULT

J.  F.  Traub 

The number of function evaluations sufficient to solve important problems such as
multivariate integration and multivariate approximation is completely independent of the
number of variables!

CONTEXT  FOR  THE  NEW  RESULT 
The  following  bullets  put  this  new  result  into  context. 

•  High-dimensional  problems  occur  in  numerous  applications  in  science  and 
engineering. 

• Most of these problems cannot be solved analytically. They have to be solved
numerically, and hence approximately.

• Most multivariate problems are intractable in dimension. A typical result is that if
accuracy ε is desired and there are d variables, then the computational complexity is
(1/ε)^d.

• Thus, if a two-place answer is desired, the problem is 100 times harder for each
additional variable. If eight-place accuracy is desired, the problem is 100,000,000
times harder for each additional variable.

• Although the physicists at Los Alamos did not know about computational
complexity, they realized they could not solve certain problems. This led to the
invention of Monte Carlo methods. For example, the computational complexity of
multivariate integration in the randomized setting is proportional to 1/ε² and
therefore tractable.

• It was shown in 1989 that Monte Carlo methods cannot be used to break unsolvability or
intractability of multivariate approximation.

• An alternative to the randomized setting is the average case setting in which we seek
to break unsolvability and intractability by replacing a worst case guarantee that the
error is less than the threshold ε with the weaker guarantee that the expected error is
less than ε. Note that this is a deterministic setting; one has to solve the problem of
optimal sample points.

• In 1991 it was shown that multivariate integration is tractable on the average. On a
power scale, the average computational complexity of multivariate integration is
proportional to 1/ε. For small ε this is a major improvement over Monte Carlo,
although for a different error criterion. Optimal sample points were obtained.

• Are other important multivariate problems tractable on the average? In 1992 it was
shown that approximation is also tractable on the average. On a power scale the
average computational complexity of multivariate approximation is proportional to
1/ε². Optimal sample points were obtained.

• In the results stated above we ignored a multiplicative factor depending on the
dimension d. For example, the average computational complexity of multivariate
integration is g(d)/ε, where g(d) is a multiplicative factor which depends only on
the number of variables. Good theoretical estimates of g(d) are not known and
obtaining them is believed to be very hard.
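
The growth rates quoted in the bullets above can be compared numerically. The sketch below is only a back-of-the-envelope illustration: it evaluates the power-scale expressions (1/ε)^d, 1/ε² and 1/ε with every constant and dimension-dependent factor set to one, which is an assumption made purely for the comparison, and the function names are hypothetical.

```python
def worst_case_cost(eps, d):
    """(1/eps)^d: the typical worst-case deterministic growth rate."""
    return (1.0 / eps) ** d


def monte_carlo_cost(eps):
    """(1/eps)^2: growth rate in the randomized (Monte Carlo) setting."""
    return (1.0 / eps) ** 2


def average_case_cost(eps):
    """1/eps: power-scale growth rate for integration on the average,
    with the dimension-dependent factor ignored."""
    return 1.0 / eps


if __name__ == "__main__":
    eps = 1e-2  # a two-place answer
    print("d, worst case, Monte Carlo, average case")
    for d in (1, 2, 5, 10):
        print(d, worst_case_cost(eps, d), monte_carlo_cost(eps), average_case_cost(eps))
```

Raising d by one multiplies the first column by 1/ε, while the other two columns do not change with d under these simplifications.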


THE NEW RESULT

• An entirely new approach can be used. We get rid of the factor g(d).

• Specifically, we say that a problem is strongly tractable if the number of function
evaluations needed for the solution is completely independent of the number of
variables. It depends only on a power of 1/ε.

• This seems too much to ask for, but both multivariate integration and multivariate
approximation are strongly tractable on the average!

•  This  result  is  so  new  that  it  has  not  yet  been  written  up. 

• The result is given by a theorem and is non-constructive! That is, we know there
must exist evaluation points in d dimensions which make integration and
approximation strongly tractable, but these points are not yet known.


FUTURE  RESEARCH 

An exciting new result suggests new questions and directions, some of which we list here.

• What are the points of evaluation which make multivariate integration and
multivariate approximation strongly tractable? This is a major challenge.

• We are currently implementing and testing software for multivariate integration
using the known points which make this problem tractable on the average (but not
strongly tractable).

• We then plan to implement and test this software for a network of workstations.

•  We  also  plan  to  implement  and  test  software  for  multivariate  approximation. 

• It has been shown that multivariate integration and multivariate approximation are
strongly tractable. What other problems are strongly tractable?

1. Overview of Information-Based Complexity

The goal of this article is to report some of the recent progress in information-based
complexity, which for brevity will be denoted as IBC. We have selected topics which might
be of particular interest to the EATCS audience. We take an informal approach in this
article, focusing mainly on ideas. For precise formulations and results, as well as proof
techniques, see the books TW¹ [80], TWW [83], Novak [88], TWW [88], Werschulz [91],
and recent surveys, PT [87], PW [87], TW [91a, 91b], Heinrich [92], and Novak [93].

We  begin  by  presenting  a  greatly  simplified  picture  of  computational  complexity  to 
indicate  where  IBC  fits  in.  For  our  present  purpose,  computational  complexity  may  be 
divided  into  two  branches,  discrete  and  continuous.  Continuous  computational  complexity 
may  again  be  split  into  two  branches.  The  first,  which  we’ll  call  continuous  combinatorial 
complexity,  deals  with  problems  for  which  the  information  is  complete.  Problems  where 
the  information  may  be  complete  are  those  which  are  specified  by  a  finite  number  of 
parameters. Examples include linear algebraic systems, matrix multiplication, and systems
of  polynomial  equations.  Blum,  Shub  and  Smale  [89]  obtained  the  first  NP-completeness 
results  over  the  reals  for  a  problem  with  complete  information. 

The  other  branch  of  continuous  computational  complexity  is  IBC.  Typically,  IBC  studies 
infinite-dimensional problems. These are problems where either the input or the output
are elements of infinite-dimensional spaces. Since digital computers can handle only finite
sets of numbers, infinite-dimensional objects such as functions on the reals must be replaced
by finite sets of numbers. Thus, complete information is not available about such objects.
Only partial information is available when solving an infinite-dimensional problem on a
digital computer. Typically, information is contaminated with errors such as round-off
error, measurement error, and human error. Thus, the available information is partial
and/or contaminated.

We want to emphasize this point for it is central to IBC. Since only partial and/or
contaminated information is available, we can solve the original problem only approximately.
A goal of IBC is to obtain the computational complexity of computing such an
approximation.

In Figure 1 we schematize the structure of computational complexity described above.


¹ When one of us is a co-author, the citation will be made using only initials.

Computational Complexity
    Discrete Complexity
    Continuous Complexity
        Information-Based Complexity
        Continuous Combinatorial Complexity

Figure 1


The  motivation  for  studying  IBC  is  two-fold: 

(1) Continuous models, typically infinite-dimensional, are very common in science,
engineering, economics, and even in finance. Examples of the mathematical problems
which arise from these models are partial or ordinary differential equations,
multivariate integration, and optimization.

(2)  The  subject  matter  covered  by  IBC  is  rich  from  a  complexity  point  of  view  with 
many  results  and  numerous  open  questions,  as  we  hope  to  illustrate  in  this  article. 

Although IBC typically studies infinite-dimensional problems there are important
exceptions. These include probabilistic complexity of processor synchronization with
stochastic delays, Wasilkowski [88a], and complexity of solving large linear systems, TW [84],
Nemirovsky [91, 92].

IBC  is  formulated  as  an  abstract  theory;  see  the  Appendix.  The  applications  often 
involve  multivariate  functions  over  the  reals.  For  example,  in  multivariate  integration, 
the  integrand  is  a  multivariate  function.  In  optimization,  one  seeks  an  extremum  of  a 
multivariate function subject to multivariate constraints. In an initial-value problem, such
as the wave equation, the initial condition is again specified by a multivariate function.

The observation that a function over the reals cannot be entered into a digital computer
lies at the heart of IBC. (In the general case, an element of an abstract space cannot be
entered.) We call a multivariate function a mathematical input, denoted by I_math. Let S
be a linear or nonlinear operator which specifies the problem we want to solve, S : F → G
for some sets F and G. The operator S carries I_math from F into a mathematical output
O_math in G; see Figure 2(a).


    I_math --S--> O_math

Figure 2(a)

Of course, this is too general to characterize an IBC problem. For example, I_math could
be the locations of a set of cities and O_math could be an optimal tour, which is a typical
discrete problem. This is an IBC problem when I_math cannot be entered into a digital
computer, and it must be replaced by a computer input denoted by I_comp.

The computer input, I_comp, consists of a finite set of numbers. For example, if I_math
is a function then I_comp might consist of its values at certain points. I_comp is obtained
from I_math by information operations. Different disciplines have different names for these
information operations. Computer scientists call them oracle calls, mathematicians call
them functionals, and engineers call them black-box calls. The replacement of I_math by
I_comp may be viewed as a discretization.

Denote the set of information operations by N(I_math); we call N the information operator.
Since many (typically, an infinite number of) mathematical inputs map into the same
computer input, the mapping N is many-to-one. That is, discretization is irreversible. The
situation is diagrammed in Figure 2(b).


    I_math --S--> O_math
      |
      N
      v
    I_comp

Figure 2(b)


Although there has been mention of neither computer output nor algorithm, we can
already draw certain conclusions. Since N is a many-to-one map, the computer does not
know the mathematical input. Therefore, it is impossible to solve this problem exactly;
the best we can hope for is an approximation.
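
The many-to-one nature of N can be seen in a few lines. The sketch below is an illustration under assumptions of our own choosing: the problem is integration over [0, 1], the information consists of function values at five equally spaced points, and the two integrands `f_zero` and `f_hat` are hypothetical. Both produce the same computer input, yet their integrals differ, so no algorithm working from that input alone can be exact for both.

```python
def information(f, points):
    """N(f): the finite list of function values the computer actually sees."""
    return [f(x) for x in points]


def integral(f, m=10_000):
    """Reference value of the integral over [0, 1] via a fine midpoint rule."""
    h = 1.0 / m
    return h * sum(f((k + 0.5) * h) for k in range(m))


# Sample points 0, 0.25, 0.5, 0.75, 1 (an arbitrary choice for the example).
points = [k / 4 for k in range(5)]


def f_zero(x):
    """The zero function."""
    return 0.0


def f_hat(x):
    """Distance to the nearest sample point: zero at every sample point,
    positive in between."""
    return min(abs(x - p) for p in points)


if __name__ == "__main__":
    print("same computer input:", information(f_zero, points) == information(f_hat, points))
    print("integral of f_zero :", integral(f_zero))
    print("integral of f_hat  :", integral(f_hat))
```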

We assign the same cost to each information operation. Given an error threshold ε, we
can define the information complexity, COMP^info(ε), as the minimal cost of the information
operations needed to obtain an ε-approximation. (In computational learning theory this
is called sample complexity.) Information complexity can be defined in different settings
such as the worst case, average case or probabilistic setting.

Using the concept of radius of information, r(N), see TW [80, pp. 9-15], TWW [88,
pp. 43-45, 197-200, 327-328], we can often obtain sharp lower and upper bounds on the
information needed to get an ε-approximation. The information N is powerful enough to
obtain an ε-approximation iff

$$ r(N) \le \varepsilon. $$

Since the information complexity is a lower bound on the computational complexity,
defined below, this has led to proven (not conjectured) intractability and unsolvability results
which we’ll describe in Section 2.
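
For a concrete feel for the radius of information, consider integration over [0, 1] of functions satisfying a Lipschitz condition with constant at most one, with information given by function values at points x_1, ..., x_n. For this class the worst case radius equals the integral of the distance to the nearest sample point, which is 1/(4n) when the points are the midpoints of n equal subintervals. The sketch below checks this numerically; the function names and the quadrature resolution `m` are assumptions made for the illustration.

```python
import math


def radius_of_information(points, m=20_000):
    """Approximate the integral over [0, 1] of the distance to the nearest
    sample point, using a fine midpoint rule with m cells."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        x = (k + 0.5) * h
        total += min(abs(x - p) for p in points)
    return total * h


def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def minimal_n(eps):
    """Smallest n for which the midpoint information has radius 1/(4n)
    at most eps."""
    return math.ceil(1.0 / (4.0 * eps))


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, radius_of_information(midpoints(n)), 1.0 / (4 * n))
    print("n needed for eps = 0.125:", minimal_n(0.125))
```

Since the required n grows like 1/ε, this problem is easy in the worst case setting, in contrast to the d-dimensional problems discussed later in the article.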

Because  of  the  basic  role  played  by  information-level  results  we  decided  to  name  this 
area  information-based  complexity.  This  level  typically  does  not  exist  for  discrete  problems. 
However,  combinatorial  issues  will  play  an  increasingly  important  role  in  IBC:  see  Section 
4. 

Let the computer output be denoted by O_comp and the operator that maps I_comp into
O_comp by φ. We call φ a combinatory algorithm (algorithm for brevity). Since φ maps
the computer input into the computer output it plays the same role as algorithm does
elsewhere in computer science. Figure 2(c) completes the picture.


Figure 2(c)

Observe that O_comp ≠ O_math because N is many-to-one. In other words, S does not
commute with φ composed with N.

We now discuss the model of computation used in IBC. For simplicity, we restrict
ourselves to the case that G = ℝ. We assume that the real number model is chosen as our
model of computation. (See Section 5 for a discussion of why the real number model is
often used in IBC and also of research on finite models.) That is, we assume that arithmetic
operations and comparisons on real numbers are carried out exactly and at unit cost.

We define the combinatorial complexity, COMP^comb(ε), as the minimal cost of the
combinatory operations needed to compute an ε-approximation if all information operations
were free.

Finally, we define the computational complexity, COMP(ε), as the minimal cost of
computing the computer output with error at most ε under the assumption that information
and combinatory operations are charged.

As before, combinatorial and computational complexity may be defined in the worst
case, average case and probabilistic settings. Note that

$$ \mathrm{COMP}(\varepsilon) \ge \max\{\mathrm{COMP}^{\mathrm{info}}(\varepsilon),\ \mathrm{COMP}^{\mathrm{comb}}(\varepsilon)\}. $$

We  conclude  this  overview  by  characterizing  IBC  and  stating  its  major  goals.  IBC 
studies  problems  which  have  the  properties  listed  below. 

(1) I_comp ≠ I_math.

(2) There is a charge for obtaining I_comp.

We discuss the first of these. There are two major reasons why I_comp ≠ I_math. The first is
that the mathematical input cannot be represented by a finite set of numbers. We say the
information about I_math is partial. An important example in applications is when I_math is a
multivariate function. A second reason is that the information about I_math is contaminated.
Information may be contaminated because of round-off or measurement errors.

We  list  some  of  the  major  goals  of  IBC. 

(1) Obtain good lower and upper bounds on the computational complexity, information
complexity, and combinatorial complexity.

(2) Find information N and an algorithm φ for which the computational complexity
is attained or nearly attained. Such N and φ are called optimal, or nearly optimal.

We summarize the remainder of this article. We will present a selection of recent results
from a number of IBC areas. We then conclude this article with a discussion of similarities
and  differences  with  discrete  complexity  and  a  brief  history.  An  abstract  formulation  of 
IBC  may  be  found  in  the  Appendix. 

2.  Breaking  Intractability 

It has been established that in the worst case deterministic setting many problems
studied in IBC are unsolvable or intractable. More precisely, let the mathematical input
f be a multivariate function of d variables. Let the smoothness of the set of inputs be
denoted by r. For example, we might require that all partial derivatives of f up to order r
exist and are uniformly bounded by 1. Assume we want to guarantee an error at most ε.
Then, for many continuous problems the worst case computational complexity, COMP(ε),
is given by

$$ \mathrm{COMP}(\varepsilon) = \Theta\!\left(\left(\frac{1}{\varepsilon}\right)^{d/r}\right). \qquad (1) $$


For example, multivariate integration, function approximation, partial differential
equations, integral equations, and nonlinear optimization all have this computational
complexity, see Bakhvalov [59], Heinrich [93], Nemirovsky and Yudin [83], Novak [88], Pereverzev
[89], TWW [88], and Werschulz [91].

Furthermore, many problems in science, engineering, economics and even finance use
mathematical models with large d. For example, computational chemistry, computational
design of pharmaceuticals, and computational metallurgy involve computation with a large
number of particles. Since the specification of each particle requires three variables for
static problems and six variables for dynamic problems, this leads to problems with very
large d. For path integrals, important in the foundation of physics, d = +∞; they invite
approximation by multivariate integration with huge d. Problems with large d are also
important in mathematical disciplines such as statistics and geometry.

Observe that we can conclude that if the smoothness r is fixed and positive then the
computational complexity is an exponential function in d. Thus, problems whose complexity
is governed by (1) are intractable in d. If r = 0, that is, if the class of inputs is only
continuous, then COMP(ε) = +∞ for small ε; that is, the problem is unsolvable.
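
To make the exponential dependence on d concrete, here is a small worked instance. It assumes that (1) has the form COMP(ε) = Θ((1/ε)^{d/r}) and ignores the constants hidden in Θ; the values ε = 10^{-2}, r = 2 and d = 12, 24 are chosen purely for illustration.

```latex
\[
\left(\frac{1}{\varepsilon}\right)^{d/r}
  = \exp\!\left(\frac{d}{r}\,\ln\frac{1}{\varepsilon}\right),
\qquad
\varepsilon = 10^{-2},\ r = 2:\quad
d = 12 \;\Rightarrow\; \left(10^{2}\right)^{6} = 10^{12},
\qquad
d = 24 \;\Rightarrow\; \left(10^{2}\right)^{12} = 10^{24}.
\]
```

Doubling the number of variables squares the cost, which is why greater smoothness r only postpones, and does not remove, the growth in d.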

The only way to break unsolvability or intractability is to weaken the assurance of an
ε-approximation by shifting to another setting. Three settings have been used for trying to
break intractability: randomized, average case, and probabilistic settings. Here we confine
ourselves to recent advances on breaking intractability in the average case setting. See
TW [91a] for a survey of how to break intractability in the randomized setting.

We  describe  recent  advances  in  breaking  intractability  for  multivariate  integration  and 
multivariate  function  approximation.  Multivariate  integration  is  especially  common  since 
computing the expectation of any stochastic process leads to this problem.

In the average case setting the average computational complexity, COMP^avg(ε), is
defined as the minimal expected cost such that the average error is less than ε. One has to
put a measure on the space of inputs. Although for discrete problems one can assume that
all inputs are equiprobable, no such assumption can be made for typical sets of functions.
The most commonly used measures on function spaces are Gaussian measures, and, in
particular, Wiener measures which are a special kind of Gaussian measure.

It was known that multivariate integration is tractable on the average but the proof is
non-constructive. That is, the optimal points at which the integrand should be evaluated
and the average computational complexity were unknown.

Then W [91] established a relation between discrepancy and the average complexity of
multivariate integration. Discrepancy has been extensively studied in number theory and
sharp bounds on discrepancy in d dimensions were established by Roth [54, 80]. The use
of the results from discrepancy theory solved the multivariate integration problem.

We describe the results more precisely. Let r = 0. Recall that in the worst case
deterministic setting the problem is unsolvable. Assume the measure on the integrands is
the Wiener sheet measure. Then

$$ \mathrm{COMP}^{\mathrm{avg}}(\varepsilon) = \Theta\!\left(\frac{1}{\varepsilon}\left(\log\frac{1}{\varepsilon}\right)^{(d-1)/2}\right). $$

Thus a problem which is worst-case unsolvable becomes tractable on the average.² Either
Hammersley points or hyperbolic-cross points are nearly optimal as the evaluation points
in d dimensions. These results were generalized to the case of smooth inputs by Paskov
[93].
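
To show what evaluation points of this kind look like, the sketch below generates the standard Hammersley point set and uses it in an equal-weight average. This is an illustration only: whether the cited results use exactly this point set or a shifted or reflected variant, and which weights they use, is not stated here, so both choices are assumptions, as are the function names and the test integrand.

```python
def radical_inverse(i, base):
    """Van der Corput radical inverse: mirror the base-`base` digits of the
    integer i about the radix point."""
    inv, scale = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        inv += digit * scale
        scale /= base
    return inv


def first_primes(k):
    """The first k primes, by trial division (adequate for small k)."""
    primes, cand = [], 2
    while len(primes) < k:
        if all(cand % p for p in primes):
            primes.append(cand)
        cand += 1
    return primes


def hammersley(n, d):
    """n Hammersley points in [0, 1)^d: first coordinate i/n, remaining
    coordinates radical inverses of i in the first d - 1 prime bases."""
    bases = first_primes(d - 1)
    return [[i / n] + [radical_inverse(i, b) for b in bases] for i in range(n)]


def qmc_integrate(f, points):
    """Equal-weight average of f over the given points."""
    return sum(f(x) for x in points) / len(points)


if __name__ == "__main__":
    pts = hammersley(1024, 3)
    approx = qmc_integrate(lambda x: x[0] * x[1] * x[2], pts)
    print("approximation:", approx, "exact value: 0.125")
```

Unlike Monte Carlo samples, these points are deterministic, which matches the remark that the average case setting is a deterministic one.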

We turn to the average complexity of function approximation. This is particularly
important since unlike for multivariate integration, it is known that randomization does
not help for function approximation, see Wasilkowski [88b], Novak [92]. Again, let r = 0
and assume a Wiener sheet measure. Then

$$ \mathrm{COMP}^{\mathrm{avg}}(\varepsilon) = \Theta\!\left(\left(\frac{1}{\varepsilon}\right)^{2}\left(\log\frac{1}{\varepsilon}\right)^{2(d-1)}\right) $$

and again hyperbolic-cross points can be used; see W [92b].

Roth’s discrepancy results and the average computational results quoted above are big
theta results in ε. That is, the dependence on ε is known, but there is a multiplicative
factor, g(d), which is not known. If we’re serious about solving problems with large d we
must be able to bound g(d). It is believed that obtaining good theoretical estimates of
g(d) is very hard.

The problem may be solved by getting rid of the factor g(d) in the following way, W [93].
A problem is said to be strongly tractable if the number of information operations, m(ε, d),
needed to compute an ε-approximation is independent of d and depends polynomially on
1/ε, that is,³

$$ m(\varepsilon, d) \le K \left(\frac{1}{\varepsilon}\right)^{p}, \qquad \forall d,\ \forall \varepsilon \le 1, $$

for certain numbers K and p.

² By tractable (in 1/ε) we mean that the complexity is bounded by K(d)(1/ε)^p for all d and ε ≤ 1 for
a number p which is independent of d and ε.

³ More precisely, it is required that the computational complexity can be bounded by K c(d) (1/ε)^p for
certain numbers K and p, independent of d and ε, where c(d) is the cost of one information evaluation of
a function of d variables.

That might seem too much to expect but multivariate integration and multivariate
approximation are both strongly tractable on the average⁴ and it is sufficient to take the
information operations as function evaluations, W [93]. Usually in computational
complexity, an upper bound is given by an algorithm and a lower bound by a theorem. But
in this case, the upper bound has been determined by a theorem and is non-constructive.
That is, we know that there must exist sample points at which we should evaluate the
function and a combinatory algorithm which make multivariate integration and
approximation strongly tractable. The construction of such sample points and algorithm is being
studied; WW [94].
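
The difference between a tractable and a strongly tractable bound can be made concrete with a small check. The sketch below is illustrative only: the two cost functions `m_independent` and `m_exponential` are hypothetical, invented to show one case that satisfies a bound of the form K(1/ε)^p for every d and one that does not, and a check on a finite grid is evidence rather than a proof.

```python
import math


def strong_bound(eps, K, p):
    """K * (1/eps)^p: a bound with no dependence on the dimension d."""
    return K * (1.0 / eps) ** p


def satisfies_strong_bound(m, K, p, eps_values, d_values):
    """Check m(eps, d) <= K * (1/eps)^p on a finite grid of (eps, d) pairs."""
    return all(
        m(eps, d) <= strong_bound(eps, K, p)
        for eps in eps_values
        for d in d_values
    )


def m_independent(eps, d):
    """Hypothetical cost that ignores d: ceil(3 / eps^2) evaluations."""
    return math.ceil(3.0 / eps ** 2)


def m_exponential(eps, d):
    """Hypothetical cost with a dimension factor 2^d: tractable in 1/eps
    for each fixed d, but not strongly tractable."""
    return 2 ** d / eps


if __name__ == "__main__":
    eps_values = (1.0, 0.5, 0.1, 0.01)
    d_values = (1, 2, 10, 100)
    print(satisfies_strong_bound(m_independent, 4.0, 2, eps_values, d_values))
    print(satisfies_strong_bound(m_exponential, 4.0, 2, eps_values, d_values))
```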

Due  to  the  relation  between  discrepancy  and  average  case  multivariate  integration, 
strong  tractability  for  multivariate  integration  implies  that  the  discrepancy  of  n  points 
in d dimensions can be bounded, independently of d, by Kn~^ with the same K and
p  for  both  problems.  This  estimate  is  of  interest  in  its  own  right  since  discrepancy  is 
of  considerable  interest  in  number  theory,  see  Beck  and  Chen  [87],  and  Niederreiter  [92]. 
Furthermore  there  are  numerous  applications  of  discrepancy;  for  example,  for  applications 
in  computer  graphics,  see  Dobkin  and  Mitchell  [93]. 

3.  Verification 

Most of IBC has been devoted to the computational complexity of computing an
ε-approximation. Recently, the computational complexity of verification has been studied,
that is, checking whether an answer is correct, see W [92a]. In addition to being given a
problem, we are also given an “answer” g and asked whether it is true that g is within ε
of the mathematical output; see the Appendix for a precise definition.

The  reader’s  reaction  may  be  that,  of  course,  verification  is  no  harder  than  computation. 
Indeed,  if  the  mathematical  output  can  be  computed  exactly  at  finite  cost,  as  is  the  case  for 
discrete problems, then with one extra comparison one can solve the verification problem.

However,  for  typical  IBC  problems  the  mathematical  output  cannot  be  computed  with 
finite  cost,  and  the  relation  between  verification  and  computation  is  not  obvious.  As  we 
shall  see,  in  the  worst  case  setting  verification  may  be  unsolvable  while  the  corresponding 
computational  problem  is  easy. 

We illustrate this with a simple example. The computational problem is to compute
an ε-approximation to ∫_0^1 f(x) dx where the mathematical input f is an arbitrary function
over [0, 1] satisfying a Lipschitz condition with constant at most one. The computational
input is given by values of f at some points. The computational complexity in the worst
case setting is known to be of order 1/ε; thus the computational problem is “easy”.

⁴ We stress that this holds for the Wiener sheet measure. For an isotropic Wiener measure, function
approximation is still intractable even on the average, see Wasilkowski [93].

Suppose now that we’re given the purported answer g and asked to check whether this
is within ε of the integral of f. We show that the verification problem is unsolvable.

Suppose that we compute f at a finite number of points x_i and that for every such
point f(x_i) = g + ε. If we answer NO the adversary will choose f(x) = g + ε. This
function is certainly Lipschitz (with constant zero), and compatible with the computed
function values. Since ∫_0^1 f(x) dx = g + ε is within ε of the answer g, we made a mistake
by answering NO.

If we answer YES the adversary will choose a hat function f going through the points
(x_i, g + ε) and with Lipschitz constant one. Clearly, ∫_0^1 f(x) dx > g + ε which is not within
ε of the answer g. We made a mistake by answering YES. Hence, as long as we have finitely
many function values, there is no way to solve the verification problem in the worst case
setting.
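
The adversary argument above can be replayed numerically. In the sketch below the sample points, the values g = 0 and ε = 0.1, and all function names are assumptions chosen for the illustration. The two adversarial integrands agree at every sample point, yet one has an integral within ε of g and the other does not, so neither YES nor NO is safe.

```python
def constant_adversary(g, eps):
    """f(x) = g + eps: Lipschitz with constant zero."""
    return lambda x: g + eps


def hat_adversary(g, eps, points):
    """f(x) = g + eps + distance to the nearest sample point: Lipschitz with
    constant one, equal to g + eps at every sample point."""
    return lambda x: g + eps + min(abs(x - p) for p in points)


def integrate(f, m=8_000):
    """Reference value of the integral over [0, 1] via a fine midpoint rule."""
    h = 1.0 / m
    return h * sum(f((k + 0.5) * h) for k in range(m))


if __name__ == "__main__":
    g, eps = 0.0, 0.1
    points = [0.125, 0.375, 0.625, 0.875]
    f_const = constant_adversary(g, eps)
    f_hat = hat_adversary(g, eps, points)

    # Both functions produce exactly the same computed function values ...
    print("same data:", all(f_const(x) == f_hat(x) for x in points))
    # ... but only the first integral lies within eps of g.
    print("integral of constant function minus g:", integrate(f_const) - g)
    print("integral of hat function minus g     :", integrate(f_hat) - g)
```

Adding more sample points shrinks the hats but never removes them, which is why no finite number of function values resolves the question.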

It can be shown that verification for IBC problems is often unsolvable in the worst
case setting. Verification is therefore studied in the probabilistic setting. Here we want to
verify that g is an ε-approximation with confidence δ; see the Appendix. In this setting
the probabilistic complexity of verification depends on how ε and δ are related. Any
relation between the probabilistic complexities of verification and computation is possible.
In particular, verification can be exponentially (in δ) harder than computation.

NW [92] studied relaxed verification in the worst case setting. That is, the answers can
be YES, NO, or DON’T CARE. The size of the DON’T CARE region is specified by a
parameter $\alpha$; see the Appendix. For a positive $\alpha$, the worst case complexity of relaxed
verification is finite. It is related to the worst case complexity of the computational problem
with $\varepsilon$ replaced by roughly $\alpha^{q}\varepsilon$ with $q \in (0, 1]$ depending on the problem. Hence, if $\alpha$ is not
too small, the complexity of relaxed verification is roughly comparable to the complexity
of the computational problem. If, however, $\alpha$ is small then the complexity of relaxed
verification is usually much larger than the complexity of the computational problem.

4.  Combinatorial  Complexity 

To date, IBC problems have usually been proven unsolvable or intractable by showing
that their information complexity was infinite or exponential. Recent results establish
unsolvability or intractability by showing that the combinatorial complexity is infinite or,
if P $\ne$ NP, not polynomial. We report these results and also pose an open question.

Papadimitriou and Tsitsiklis [86] is a pioneering paper which proves that a nonlinear
problem in decentralized control theory is intractable if P $\ne$ NP. More precisely, the
information complexity is a polynomial in $1/\varepsilon$ but the combinatorial complexity in a Turing
machine model of computation is not polynomial in $1/\varepsilon$, if P $\ne$ NP.

WW [93] show that there exists a linear problem whose information complexity is a
polynomial in $1/\varepsilon$ but whose combinatorial complexity is infinite¹, making the problem
unsolvable. An “artificial” problem is constructed to show that even a linear problem can
be very hard combinatorially. Chu [94] shows that the combinatorial complexity can be
any increasing function of the information complexity.

We pose an open question. So far, tight bounds on the computational complexity of
IBC problems are achieved when the minimal amount of information is used. Is there
a problem for which more information operations should be used to achieve the
computational complexity? That is, does there exist a problem for which the minimal amount
of information is very hard to combine, but if more information operations are computed
then it is easier to combine them, and the total cost of computing an $\varepsilon$-approximation is
minimized in the latter case?

We believe that in the future, progress in IBC will increasingly require results in both
information complexity and combinatorial complexity.

5.  Similarities  and  Differences  with  Discrete  Complexity 

We begin with similarities. As in the rest of computational complexity, IBC studies
lower and upper bounds on the computational difficulty of solving mathematically posed
problems. Optimal and near-optimal algorithms are sought. To attempt to break the
intractability results and conjectures of the worst case deterministic setting, both IBC
and discrete complexity turned to other settings such as the randomized and average case
settings.

There are also significant contrasts, three of which we will discuss in the remainder of
the section. IBC has the following characteristics:

Problems cannot be exactly solved

Intractability has been proven for many problems

Real number model usually used

We  discuss  each  of  these. 

¹This result holds if we allow arithmetic operations, comparisons of real numbers, and precomputation.
It is open if there exists a linear problem with finite information complexity and infinite combinatorial
complexity in the extended real number model in which logarithms, exponentials and ceilings are allowed.
Problems Cannot Be Exactly Solved

As  discussed  in  Section  1,  it  is  impossible  to  solve  IBC  problems  exactly  because  Icomp  7^ 
It  is  possible,  in  principle,  to  solve  discrete  problems  exactly  although  one  may 
choose  to  solve  them  approximately  to  reduce  the  cost. 

Intractability  has  been  proven  for  many  problems 

Using information-level arguments, unsolvability and intractability have been established
for many IBC problems. With only a few exceptions, there are no non-trivial lower bounds
on the combinatorial complexity of IBC problems. Since only combinatorial arguments are
available, intractability of many discrete problems has been conjectured. (Of course, lower
bounds, as well as unsolvability results, have been established for some combinatorial
problems.)

Real  number  model  usually  used 

To date, the real number model of computation has usually been used in continuous
computational complexity. After discussing the motivation, we turn to finite models for
continuous computational complexity.

Scientific problems are usually solved using fixed precision floating point arithmetic.
The cost of floating point operations and comparisons is independent of the size of the
operands. Furthermore, all arithmetic operations and comparisons cost about the same to
execute. Our goal is to choose a model of computation that corresponds to performance of
a digital computer executing floating point arithmetic. The abstraction we choose is the
real number model, which assumes that arithmetic and comparisons on real numbers can
be executed exactly and at unit cost. (The choice of unit cost is just scaling.) Rounding
errors occur when a digital computer executes operations in fixed precision floating point
arithmetic. In our abstraction we assume arithmetic is performed without error. This
separation of complexity theory from error analysis is done for technical reasons;
computational complexity theory is hard enough without including round-off error. When an
interesting new algorithm is discovered from computational complexity considerations, a
stability analysis in fixed precision floating point arithmetic must be performed.

We stress that the real number model is not polynomially equivalent to the Turing
machine model. For example, TW [82] show that the cost of Khachian’s algorithm is
not polynomial in the real number model and conjecture that linear programming is not
polynomial in this model. This conjecture is still open.

Several finite models of computation have also been analyzed. One of them is a model
based on recursive analysis; see Ko [91].

In the bit model it is assumed that one can get a rational binary approximation of a
real number or of a function value to within any accuracy with the cost depending on the
number of bits. This model has been studied for problems with complete information, for
instance, for finding roots of polynomials; see Schönhage [86]. A mixed model, in which the
bit model is used for information operations, and the real number model for combinatory
operations, is utilized by Kacewicz and Plaskota [90] to analyze certain IBC problems.

It  is,  of  course,  desirable  to  fully  explore  finite  models  for  IBC  problems  and  we  believe 
this  to  be  an  important  direction  for  future  research. 

6.  A  Brief  History 


We present a very brief history of IBC. Research in the spirit of IBC was initiated in
the Soviet Union by Kolmogorov in the late 40’s. Nikolskij [50], a student of Kolmogorov,
studied optimal quadrature. This line of research was greatly advanced by Bakhvalov; see,
e.g., Bakhvalov [59, 71]. In the United States research in the spirit of IBC was initiated
by Sard [49] and Kiefer [53]. Kiefer reported the results of his 1948 MIT Master’s Thesis
that Fibonacci sampling is optimal when approximating the maximum of a unimodal
function. Sard studied optimal quadrature. Golomb and Weinberger [59] studied optimal
approximation of linear functionals. Schoenberg [64] realized the close connection between
splines and algorithms optimal in the sense of Sard.

T [61, 64] initiated the study of iterative computational complexity, emphasizing the
central role of information. Maximal order results, needed to obtain lower bounds on
computational complexity, were obtained for scalar nonlinear equations. W [75] introduced
the concept of order of information in an abstract space, which provides a general tool for
establishing maximal order of an algorithm.

Micchelli and Rivlin [77] studied optimal recovery and considered optimal error
algorithms for the approximation of linear operators. Linear noisy information was permitted.

A general formulation of IBC, primarily in the worst case deterministic setting, is
presented in TW [80], where a somewhat more detailed history and an annotated bibliography
of over 300 papers and books up to 1979 can also be found. At the time IBC was called
analytic complexity to differentiate it from algebraic complexity. TWW [88] extend the
study of IBC to numerous settings including average case, randomized, probabilistic, and
asymptotic settings, as well as mixed settings.
Appendix

We present an abstract formulation of IBC. Let

$$S : F \to G$$

where  F  is  a  subset  of  a  linear  space  and  G  is  a  normed  linear  space. 

For $f \in F$, we wish to compute an approximation to $S(f)$. To do this we must know
something about $f$. A basic assumption is that we have only partial information about
$f$. We gather this partial information about $f$ by information operations $L(f)$. Here we
will assume that $L$ is a linear functional. Let $\Lambda$ denote the class of information operations
we will permit. The choice of $\Lambda$ will depend on the problem we wish to solve. If we wish
to approximate a definite integral we must exclude definite integration as a permissible
information operation, and for this problem $\Lambda$ is usually defined as the class of function
evaluations. For other problems, such as the solution of nonlinear equations, we may
permit any linear functional. Let

$$N(f) = [L_1(f), L_2(f), \ldots, L_n(f)]$$

for $L_i \in \Lambda$. Here $L_i$, as well as $n$, can be adaptively chosen depending on the already
computed information operations.

$N(f)$ is called the information on $f$ and $N$ the information operator. The motivation
for introducing the information operator $N$ is to replace the element $f$, which is often from
an infinite-dimensional space, by $n$ numbers. An idealized algorithm² $\phi$ is an operator
$\phi : N(F) \to G$. The approximation $U(f)$ is then computed by

$$U(f) = \phi(N(f)).$$

(The assumption that the approximation is the composition of $\phi$ with $N$ is made without
loss of generality.) We seek $U(f)$ such that

$$\|S(f) - U(f)\| \le \varepsilon.$$

We say $U(f)$ is an $\varepsilon$-approximation.

We illustrate the abstract model by an integration example with

$$S(f) = \int_0^1 f(t)\,dt.$$

²By using such a general definition of algorithm, we strengthen the lower bound conclusions. For upper
bounds, we restrict the algorithms to those constructed from permissible combinatory operations.
fee' (0,1)  and  ||/L„  <  1}  ,

and $G$ as the set of real numbers. The functionals are chosen as $L_i(f) = f(t_i)$. An example
of an algorithm is

$$U(f) = \phi(N(f)) = \frac{1}{n} \sum_{i=1}^{n} f(t_i).$$
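
The split of an approximation into an information operator followed by a combining step can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the sample points are assumed to be the midpoints $t_i = (2i - 1)/(2n)$, and the combining step is assumed to be an equal-weight average.

```python
from typing import Callable, List, Sequence

def information(f: Callable[[float], float], points: Sequence[float]) -> List[float]:
    """Information operator N: replace f by the n numbers f(t_1), ..., f(t_n)."""
    return [f(t) for t in points]

def phi(values: Sequence[float]) -> float:
    """Combining step: sees only the numbers in N(f), never f itself."""
    return sum(values) / len(values)

def approximate(f: Callable[[float], float], points: Sequence[float]) -> float:
    """U(f) = phi(N(f))."""
    return phi(information(f, points))

n = 100
midpoints = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]  # assumed sample points
print(approximate(lambda t: t * t, midpoints))  # close to the exact integral 1/3
```

Keeping `phi` unaware of `f` mirrors the assumption that only partial information about $f$ is available.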

To define computational complexity we must first introduce our model of computation,
which is defined by two postulates:

(i) Let $\Omega$ denote the set of permissible combinatory operations including the addition
of two elements in $G$, multiplication by a scalar in $G$, arithmetic operations on
real numbers, and comparison of real numbers. We assume that each combinatory
operation is performed exactly with unit cost.

(ii) We assume that we are charged for each information operation. That is, for every
$L \in \Lambda$ and $f \in F$, the computation of $L(f)$ costs $c$, where $c > 0$. Typically, $c \gg 1$.

We assume the real number model, that is, we can perform operations on real numbers
exactly and at unit cost. See Section 5 for a discussion and motivations underlying the
model of computation and the real number model.

We briefly describe how the computation is carried out and how its cost is calculated. Let
$\mathrm{cost}(N, f)$ denote the cost of computing the information $N(f)$. Knowing the information
$N(f)$, the approximation $U(f) = \phi(N(f))$ is computed by combining the information to
produce an element of $G$ which approximates $S(f)$.

Let $\mathrm{cost}(\phi, N(f))$ denote the cost of computing $U(f) = \phi(N(f))$, given $N(f)$. Then
the total cost of computing $U(f)$, $\mathrm{cost}(U, f)$, is

$$\mathrm{cost}(U, f) = \mathrm{cost}(N, f) + \mathrm{cost}(\phi, N(f)).$$
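
As a rough illustration of this cost accounting, the sketch below charges $c$ for every function evaluation and one unit for every combinatory operation. The class `CostMeter` and the function `averaged_rule` are hypothetical names introduced here, and the equal-weight average is only an assumed example of an algorithm.

```python
class CostMeter:
    """Counts information operations (cost c each) and combinatory operations (unit cost)."""

    def __init__(self, c: float) -> None:
        self.c = c
        self.info_ops = 0
        self.comb_ops = 0

    def evaluate(self, f, t):
        self.info_ops += 1          # one information operation L(f) = f(t)
        return f(t)

    def add(self, a, b):
        self.comb_ops += 1          # one combinatory operation
        return a + b

    def divide(self, a, b):
        self.comb_ops += 1          # one combinatory operation
        return a / b

    @property
    def total(self) -> float:
        # cost(U, f) = cost(N, f) + cost(phi, N(f))
        return self.c * self.info_ops + self.comb_ops


def averaged_rule(f, points, meter: CostMeter) -> float:
    values = [meter.evaluate(f, t) for t in points]
    total = values[0]
    for v in values[1:]:
        total = meter.add(total, v)
    return meter.divide(total, len(values))


meter = CostMeter(c=100.0)
averaged_rule(lambda t: t * t, [0.05 + 0.1 * i for i in range(10)], meter)
print(meter.info_ops, meter.comb_ops, meter.total)
```

With $n$ points this rule uses $n$ information operations and $n$ combinatory operations ($n - 1$ additions and one division), so its total cost is $(c + 1)n$; when $c \gg 1$ the information cost dominates.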

We are ready to define the computational complexity, $\mathrm{comp}(\varepsilon)$, as

$$\mathrm{comp}(\varepsilon) = \inf \{ \mathrm{cost}(U) : U \text{ such that } e(U) \le \varepsilon \},$$

with the convention that $\inf \emptyset = \infty$. The definition of $\mathrm{cost}(U)$ and $e(U)$ varies according
to the setting. Settings studied in IBC include worst case, average case, probabilistic,
randomized and asymptotic. Mixed settings are also studied. We confine ourselves here
to the definition of just the worst case and average case settings.

Worst Case Setting: The worst case error and worst case cost of $U$ are defined by

$$e(U) = \sup_{f \in F} \|S(f) - U(f)\|,$$

$$\mathrm{cost}(U) = \sup_{f \in F} \mathrm{cost}(U, f).$$
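
To make the worst case error concrete, the sketch below evaluates the $n$-point midpoint rule on one particularly bad function. It assumes, for illustration, that $F$ is the class of functions on $[0, 1]$ with Lipschitz constant at most one (the class used in the verification example) rather than the class of this Appendix. The sawtooth $f(x) = \min_i |x - t_i|$ vanishes at every node, so the rule returns zero while the integral is $1/(4n)$. Since $|f(x) - f(t_i)| \le |x - t_i|$ for every function in this class, no function does worse, and the worst case error of this rule is $1/(4n)$. Function names are hypothetical.

```python
import numpy as np

def midpoints(n):
    """Nodes t_i = (2i - 1) / (2n), i = 1..n, of the n-point midpoint rule."""
    return (2.0 * np.arange(1, n + 1) - 1.0) / (2.0 * n)

def sawtooth(x, n):
    """Distance from x in [0, 1) to the nearest midpoint node; Lipschitz constant 1."""
    x = np.asarray(x, dtype=float)
    nearest = (np.floor(x * n) + 0.5) / n
    return np.abs(x - nearest)

def midpoint_rule_error_on_sawtooth(n, cells_per_half_interval=1000):
    """|S(f) - U(f)| for the sawtooth f, with S(f) computed on a fine grid."""
    m = 2 * n * cells_per_half_interval          # grid aligned with the kinks of f
    x = (np.arange(m) + 0.5) / m
    integral = float(np.mean(sawtooth(x, n)))    # fine-grid value of S(f)
    rule_value = float(np.mean(sawtooth(midpoints(n), n)))  # U(f), essentially zero
    return abs(integral - rule_value)

for n in (1, 10, 100):
    print(n, midpoint_rule_error_on_sawtooth(n), 1.0 / (4 * n))
```

Requiring the error $1/(4n)$ to be at most $\varepsilon$ gives $n \ge 1/(4\varepsilon)$ evaluations for this rule, in line with the order $1/\varepsilon$ complexity quoted for the Lipschitz class.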
Average Case Setting: Let $\mu$ be a probability measure defined on $F$. The average
case error and average case cost of $U$ are defined by

$$e(U) = \int_F \|S(f) - U(f)\| \, \mu(df),$$

$$\mathrm{cost}(U) = \int_F \mathrm{cost}(U, f) \, \mu(df).$$

The concept of complexity permits us to introduce the fundamental concepts of optimal
information and optimal algorithm. Information $N$ and an algorithm $\phi$ that uses $N$ are
called optimal information and optimal algorithm, respectively, iff $U = \phi \circ N$ satisfies
$\mathrm{cost}(U) = \mathrm{comp}(\varepsilon)$ and $e(U) \le \varepsilon$.

We define the verification problem. For given $g \in G$ we want to check whether
$\|S(f) - g\| \le \varepsilon$. That is, we define $\mathrm{VER}(f, g) =$ YES if $\|S(f) - g\| \le \varepsilon$, and $\mathrm{VER}(f, g) =$ NO
otherwise. In the worst case setting, we wish to find an approximation operator $U$ such
that

$$U(f, g) = \mathrm{VER}(f, g) \quad \forall f \in F,\ g \in G.$$

In the probabilistic setting, we assume that the set $F$ is equipped with a probability
measure $\mu$. For a given confidence parameter $\delta \in (0, 1]$, we wish to find an approximation
operator $U$ such that

$$\mu\{ f \in F : U(f, g) = \mathrm{VER}(f, g) \} \ge 1 - \delta \quad \forall g \in G.$$

For relaxed verification, we assume that $\alpha \in (0, 1]$ and we redefine $\mathrm{VER}(f, g)$ as follows.
We set $\mathrm{VER}(f, g) =$ YES if $\|S(f) - g\| \le \varepsilon$, $\mathrm{VER}(f, g) =$ NO if $\|S(f) - g\| > (1 + \alpha)\varepsilon$, and
$\mathrm{VER}(f, g) =$ DON’T CARE, otherwise.
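
The three-valued definition can be written out directly. The sketch below is purely definitional: it assumes $G$ is the set of real numbers and that the true value $S(f)$ is supplied, which is exactly what a verification algorithm does not have. The function name `relaxed_ver` is hypothetical.

```python
def relaxed_ver(s_f: float, g: float, eps: float, alpha: float) -> str:
    """Relaxed VER(f, g) for real-valued S(f), following the definition above."""
    if not (0.0 < alpha <= 1.0):
        raise ValueError("alpha must lie in (0, 1]")
    distance = abs(s_f - g)
    if distance <= eps:
        return "YES"
    if distance > (1.0 + alpha) * eps:
        return "NO"
    return "DON'T CARE"   # eps < distance <= (1 + alpha) * eps

print(relaxed_ver(1.0, 1.05, eps=0.1, alpha=0.5))
print(relaxed_ver(1.0, 1.12, eps=0.1, alpha=0.5))
print(relaxed_ver(1.0, 1.20, eps=0.1, alpha=0.5))
```

In the DON’T CARE band either answer is accepted, which is what removes the knife-edge exploited by the adversary in the unrelaxed worst case setting.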

The complexity of verification or relaxed verification is defined similarly as for
computational problems, that is, by minimizing the cost of computing $U$ that solves the
corresponding verification problem.
Breaking Intractability

Problems  that  would  otherwise  be  impossible 
to  solve  can  now  be  computed,  as  long  as  one 
settles  for  what  happens  on  the  average 

by  Joseph  F.  Traub  and  Henryk  Wozniakowski 


Although mathematicians and scientists must rank among the most rational
people in the world, they will often admit to falling prey to a curse. Called
the curse of dimension, it is one many people experience in some form. For
example, a family's decision about whether to refinance their mortgage with a
15- or 30-year loan can be extremely difficult to make, because the choice
depends on an interplay of monthly expenses, income, future tax and interest
rates and other uncertainties. In science, the problems are more esoteric and
arguably much harder to cope with. In the computer-aided design of
pharmaceuticals, for instance, one might need to know how tightly a drug
candidate will bind to a biological receptor. Assuming a typical number of
8,000 atoms in the drug, the biological receptor and the solvent, then because
of the three spatial variables needed to describe the position of each atom,
the calculation involves 24,000 variables. Simply put, the more variables, or
dimensions, there are to consider, the harder it is to accomplish a task. For
many problems, the difficulty grows exponentially with the number of variables.

The curse of dimension can elevate tasks to a level of difficulty at which
they become intractable. Even though scientists have computers at their
disposal, problems can have so many variables that no future increase in
computer speed will make it possible to solve them in a reasonable amount of
time.

Can intractable problems be made tractable, that is, solvable in a relatively
modest amount of computer time? Sometimes the answer is, happily, yes. But we
must be willing to do without a guarantee of achieving a small error in our
calculations. By settling for a small error most of the time (rather than
always), some kinds of multivariate problems become tractable. One of us
(Wozniakowski) formally proved that such an approach works for at least two
classes of mathematical problems that arise quite frequently in scientific and
engineering tasks. The first is integration, a fundamental component of the
calculus. The second is surface reconstruction, in which pieces of information
are used to reconstruct an object, a technique that is the basis for medical
imaging.

Fields other than science can benefit from ways of breaking intractability.
For example, financial institutions often have to assign a value to a pool of
mortgages, which is affected by mortgagees who refinance their loans. If we
assume a pool of 30-year mortgages and permit refinancing monthly, then this
task contains 30 years times 12 months, or 360 variables. Adding to the
difficulty is that the value of the pool depends on interest rates over the
next 30 years, which are of course unknown.

We shall describe the causes of intractability and discuss the techniques
that sometimes allow us to break it. This issue belongs to the new field of
information-based complexity, which examines the computational complexity of
problems that cannot be solved exactly. We shall also speculate briefly on how
information-based complexity might enable us to prove that certain scientific
questions can never be answered because the necessary computing resources do
not exist in the universe. If so, this condition would set limits on what is
scientifically knowable.

Information-based complexity focuses on the computational difficulty of
so-called continuous problems. Calculating the movement of the planets is an
example. The motion is governed by a system of ordinary differential
equations, that is, equations that describe the positions of the planets as a
function of time. Because time can take any real value, the mathematical model
is said to be continuous. Continuous problems are distinct from discrete
problems, such as difference equations in which time takes only integer
values. Difference equations appear in such analyses as the predicted number
of predators in a study of predator-prey populations or the anticipated
pollution levels in a lake.

In the everyday process of doing science and engineering, however, continuous
mathematical formulations predominate. They include a host of problems, such
as ordinary and partial differential equations, integral equations, linear and
nonlinear optimization, integration and surface reconstruction. These
formulations often involve a large number of variables. For example,
computations in chemistry, pharmaceutical design and metallurgy often entail
calculations of the spatial positions and momenta of thousands of particles.

Often the intrinsic difficulty of guaranteeing an accurate numerical solution
grows exponentially with the number of variables, eventually making the
problem computationally intractable. The growth is so explosive that in many
cases an adequate numerical solution cannot be guaranteed for situations
comprising even a modest number of variables.

To state the issue of intractability more precisely and to discuss possible
cures, we will consider the example of computing the area under a curve. The
process resembles the task of computing the vertical area occupied by a
collection of books on a shelf. More explicitly, we will calculate the area
between two bookends. Without loss of generality, we can assume the bookends
rest at 0 and 1. Mathematically, this summing process is called the
computation of the definite integral. (More accurately, the area is occupied
by an infinite number of books, each infinitesimally thin.) The mathematical
input to this problem is called the integrand, a function that describes the
profile of the books on the shelf.

Calculus students learn to compute the definite integral by following a set
of prescribed rules. As a result, the students arrive at the exact answer. But
most integration problems that arise in practice are far more complicated, and
the symbolic process learned in school cannot be carried out. Instead the
integral must be approximated numerically, that is, by a computer. More
exactly, one computes the integrand values at finitely many points. These
integrand values result from so-called information operations. Then one
combines these values to produce the answer.

JOSEPH F. TRAUB and HENRYK WOZNIAKOWSKI have been collaborating since 1973.
Currently the Edwin Howard Armstrong Professor of Computer Science at Columbia
University, Traub headed the computer science department at Carnegie Mellon
University and was founding chair of the Computer Science and
Telecommunications Board of the National Academy of Sciences. In 1959 he began
his pioneering research in what is today called information-based complexity
and has received many honors, including election to the National Academy of
Engineering. He is grateful to researchers at the Santa Fe Institute for
numerous stimulating conversations concerning the limits of scientific
knowledge. Wozniakowski holds two tenured appointments, one at the University
of Warsaw and the other at Columbia University. He directed the department of
mathematics, computer science and mechanics at the University of Warsaw and
was the chairman of Solidarity there. In 1988 he received the Mazur Prize from
the Polish Mathematical Society. The authors thank the National Science
Foundation and the Air Force Office of Scientific Research for their support.

Knowing only these values does not completely identify the true integrand.
Because one can evaluate the integrand only at a finite number of points, the
information about the integrand is partial. Therefore, the integral can, at
best, only be approximated. One typically specifies the accuracy of the
approximation by stating that the error of the answer falls within some error
threshold. Mathematicians represent this error with the Greek letter epsilon,
$\varepsilon$.

Even this goal cannot be achieved without further restriction. Knowing the
integrand at, say, 0.2 and 0.5 indicates nothing about the curve between those
two points. The curve can assume any shape between them and therefore enclose
any area. In our bookshelf analogy, it is as if an art book has been shoved
between a run of paperbacks. To guarantee an error of at most $\varepsilon$, some
global knowledge of the integrand is needed. One may need to assume, for
example, that the slope of the function is always less than 45 degrees, or
that only paperbacks are allowed on that shelf.

In summary, an investigator trying to solve an integral must usually do it
numerically on a computer. The input to the computer is the integrand values
at some points. The computer produces an output that is a number approximating
the integral.

The basic concept of computational complexity can now be introduced. We want
to find the intrinsic difficulty of solving the integration problem. Assume
that determining integrand values and using combinatory operations, such as
addition, multiplication and comparison, each have a given cost. The cost
could simply be the amount of time a computer needs to perform the operation.
Then the computational complexity of this integration problem can be defined
as the minimal cost of guaranteeing that the computed answer is within an
error threshold, $\varepsilon$, of the true value. The optimal information operations
and the optimal combinatory algorithm are those that minimize the cost.

SAMPLING POINTS indicate where to evaluate functions in the randomized and
average-case settings. The points are plotted in two dimensions for visual
clarity. The points chosen can be spaced over regular intervals such as grid
points (a), or in random positions (b). Two other types, so-called Hammersley
points (c) and hyperbolic-cross points (d), represent optimal places in the
average-case setting.

Theorems have shown that the computational complexity of this integration
problem is on the order of the reciprocal of the error threshold ($1/\varepsilon$). In
other words, it is possible to choose a set of information operations and a
combinatory algorithm such that the solution can be approximated at a cost of
about $1/\varepsilon$. It is impossible to do better. With one variable, or dimension,
the problem is rather easy. The computational complexity is inversely
proportional to the desired accuracy.

But if there are more dimensions to this integration problem, then the
computational complexity scales exponentially with the number of variables. If
$d$ represents the number of variables, then the complexity is on the order of
$(1/\varepsilon)^d$, that is, the reciprocal of the error threshold raised to a power
equal to the number of variables. If one wants eight-place accuracy (down to
0.00000001) in computing an integral that has three variables, then the
complexity is roughly $10^{24}$. In other words, it would take a trillion trillion
integrand values to achieve that level of accuracy. Even if one generously
assumes the existence of a sequential computer that performs 10 billion
function evaluations per second, the job would take 100 trillion seconds, or
more than three million years. A computer with a million processors would
still take 100 million seconds, or about three years.
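
The arithmetic in this example is easy to reproduce. The short sketch below uses hypothetical function names and assumes a year of 365.25 days; it is only a check of the figures quoted above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year

def evaluations_needed(inverse_eps: int, d: int) -> int:
    """Order of the worst case complexity (1/eps)^d, with 1/eps given as an integer."""
    return inverse_eps ** d

def years_required(evaluations: int, evals_per_second: int, processors: int = 1) -> float:
    """Time in years, assuming the evaluations split perfectly across processors."""
    return evaluations / (evals_per_second * processors) / SECONDS_PER_YEAR

n = evaluations_needed(10**8, 3)                    # eight-place accuracy, three variables
print(n == 10**24)                                  # a trillion trillion evaluations
print(years_required(n, 10**10))                    # sequential machine: millions of years
print(years_required(n, 10**10, processors=10**6))  # a million processors: a few years
```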

To discuss multivariate problems more generally, we must introduce one
additional parameter, called $r$. This parameter represents the smoothness of
the mathematical inputs. By smoothness, we mean that the inputs consist of
functions that do not have any sudden or dramatic changes. (Mathematicians say
that all partial derivatives of the function up to order $r$ are bounded.) The
parameter takes on nonnegative integer values; increasing values indicate more
smoothness. Hence, $r = 0$ represents the least amount of smoothness
(technically, the integrands are only continuous; they are rather jagged but
still connected as a single curve).

Numerous problems have a computational complexity that is on the order of
$(1/\varepsilon)^{d/r}$. For those of a more technical persuasion, multivariate
integration, surface reconstruction, partial differential equations, integral
equations and nonlinear optimization all have this computational complexity.
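
To see how the dimension $d$ and the smoothness $r$ trade off in the order $(1/\varepsilon)^{d/r}$, the sketch below reports the base-10 logarithm of that quantity, which avoids overflow for large $d$. The function name is hypothetical, constants hidden in "on the order of" are ignored, and only $r \ge 1$ is handled.

```python
import math

def log10_complexity(eps: float, d: int, r: int) -> float:
    """Base-10 logarithm of (1/eps)**(d/r), the order of the worst case complexity."""
    if r < 1:
        raise ValueError("this sketch only covers smoothness r >= 1")
    return (d / r) * math.log10(1.0 / eps)

for d in (1, 10, 100):
    for r in (1, 2, 4):
        print(f"d={d:3d} r={r}  about 10^{log10_complexity(0.01, d, r):.1f} operations")
```

More smoothness lowers the exponent, but for fixed $r$ the exponent still grows linearly in $d$.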

If the error threshold and the smoothness parameter are fixed, then the
computational complexity depends exponentially on the number of dimensions.
Hence, the problems become intractable for high dimensions. An impediment even
more serious than intractability may occur: a problem may be unsolvable. A
problem is unsolvable if one cannot compute even an approximation at finite
cost. This is the case when the mathematical inputs are continuous but jagged.
The smoothness pa-


Developing a Random Approach

In the 1940s physicists working on the Manhattan Project at Los Alamos
National Laboratory realized that some of the problems they were trying to
solve, such as the movement of neutrons through materials, lay beyond the
reach of deterministic calculations. They turned to the Monte Carlo method of
Nicholas C. Metropolis and Stanislaw M. Ulam. The strength of the method is
that its error does not depend on the number of variables in the problem.
Hence, if applicable, it breaks the curse of dimension. The classical Monte
Carlo method for multivariate integration requires at most of order $1/\varepsilon^2$
evaluations at random points, where $\varepsilon$ is the error bound. An alternative
statement is that if the integrand is evaluated at $n$ random points, then the
expected error of randomization is at most of order $1/\sqrt{n}$. Since its
formulation, the Monte Carlo method and its variations have proved to be
useful to calculate a variety of phenomena, from the size of cosmic showers to
the percolation of a liquid through a solid.

For multivariate integration, the classical Monte Carlo method is optimal
only if the smoothness parameter, $r$, of integrands is zero. In 1959 the
Russian mathematician N. S. Bakhvalov began pioneering research on the
computational complexity of multivariate integration in the randomized setting
and devised an alternative to the Monte Carlo method. Later, in 1988, Erich
Novak of the University of Erlangen-Nürnberg extended the work of Bakhvalov to
establish that the computational complexity in the randomized setting is of
order $(1/\varepsilon)^s$, with $s = 2/(1 + 2r/d)$. Note that $0 < s \le 2$. If the
smoothness parameter equals zero, then $s = 2$, and the classical Monte Carlo
method is optimal. On the other hand, if $r$ is positive, then the classical
Monte Carlo method is no longer optimal, and Bakhvalov's method can be used
instead.
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rameter  is  zero,  and  the  computational 
complexity  becomes  infinite.  Hence,  for 
'  many  problems  with  a  large  number  of 
variables,  guaranteeing  that  an  approx¬ 
imation  has  a  desired  error  becomes  an 
unsolvable  or  intractable  task. 

Mathematically, the computational complexity results we have described
apply to the so-called worst-case deterministic setting. The "worst
case" phrasing comes from the fact that the approximation provides a
guarantee that the error always falls within $\varepsilon$. In other
words, for multivariate integration, an approximation within the error
threshold is guaranteed for every integrand that has a given smoothness.
The word "deterministic" arises from the fact that the integrand is
evaluated at deterministic (in contrast to random) points.

In this worst-case deterministic setting, many multivariate problems are
unsolvable or intractable. Because these results are intrinsic to the
problem, one cannot get around them by inventing other methods.

One possible way to break unsolvability and intractability is through
randomization. To illustrate how randomization works, we will again use
multivariate integration. Instead of picking points deterministically or
even optimally, we allow (in an informal sense) a coin toss to make the
decisions for us. A loose analogy might be sampling polls. Rather than
ask every registered voter, a pollster conducts a small, random sampling
to determine the likely winner.

Theorems indicate that with a random selection of points, the
computational complexity is at most on the order of the reciprocal of
the square of the error threshold ($1/\varepsilon^2$). Thus, the problem
is always tractable, even if the smoothness parameter is equal to zero.

The workhorse of the randomized approach has been the Monte Carlo
method. Nicholas C. Metropolis and Stanislaw M. Ulam suggested the idea
in the 1940s. In the classical Monte Carlo method the integrand is
evaluated at uniformly distributed random points. The arithmetic mean of
these function values then serves as the approximation of the integral.
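The classical method just described fits in a few lines. The following is a minimal sketch, assuming integration over the unit cube $[0, 1]^d$; the names `monte_carlo_integrate` and `product_of_cosines` are hypothetical, and the test integrand was picked only because its exact integral, $\sin(1)^d$, is known in closed form.

```python
import math
import random


def monte_carlo_integrate(f, d, n, seed=0):
    """Estimate the integral of f over the unit cube [0, 1]^d.

    The estimate is the arithmetic mean of f at n uniformly
    distributed random points (classical Monte Carlo).
    """
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        total += f(x)
    return total / n


def product_of_cosines(x):
    """Hypothetical test integrand; its exact integral over [0, 1]^d is sin(1)**d."""
    return math.prod(math.cos(t) for t in x)


if __name__ == "__main__":
    d = 10
    exact = math.sin(1.0) ** d
    for n in (100, 10_000, 100_000):
        estimate = monte_carlo_integrate(product_of_cosines, d, n)
        print(f"n={n:>7}: estimate={estimate:.5f}, error={abs(estimate - exact):.2e}")
```

The dimension d enters only through the cost of drawing each point and evaluating the integrand; the number of points needed for a given expected error does not grow with it.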

Amazingly enough, for multivariate integration problems, randomization
of this kind makes the computational complexity independent of
dimension. Problems that are unsolvable or intractable if computed from
the best possible deterministic points become tractable if approached
randomly. (If r is positive, however, then the classical Monte Carlo
method is not the optimal one; see the box "Developing a Random
Approach.")
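The remark about positive r can be made concrete with the exponent quoted in the box "Developing a Random Approach": the randomized-setting complexity is of order $(1/\varepsilon)^s$ with $s = 2/(1 + 2r/d)$. The short sketch below simply evaluates that formula; the function name `randomized_exponent` and the sample values of r and d are assumptions for illustration.

```python
def randomized_exponent(r: int, d: int) -> float:
    """Exponent s = 2 / (1 + 2r/d) in the randomized-setting bound (1/eps)**s."""
    if d < 1 or r < 0:
        raise ValueError("need d >= 1 and r >= 0")
    return 2.0 / (1.0 + 2.0 * r / d)


if __name__ == "__main__":
    for d in (1, 10, 100):
        for r in (0, 1, 4):
            print(f"d={d:>3} r={r}: s = {randomized_exponent(r, d):.3f}")
```

For r = 0 the exponent is exactly 2, matching classical Monte Carlo. For fixed positive r the exponent is below 2 but approaches 2 as d grows, so the advantage over classical Monte Carlo narrows in high dimensions.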


One does not get so much for nothing. The price that must be paid for
breaking the unsolvability or intractability is that the ironclad
guarantee that the error is at most $\varepsilon$ is lost. Instead one
is left only with a weaker guarantee that the error is probably no more
than $\varepsilon$, much as a preelection poll is usually correct but
might, on occasion, predict a wrong winner. In other words, a worst-case
guarantee is impossible; one must be content with a weaker assurance.

Average-Case Complexity

In the text, we mention that the average-case complexity of multivariate
integration is on the order of the reciprocal of the error threshold
($1/\varepsilon$) and that for surface reconstruction, it is the square
of that reciprocal ($1/\varepsilon^2$). For simplicity, we ignored some
multiplicative factors that depend on d and $\varepsilon$. Here we
provide more rigorous statements.

The average computational complexity,
$\mathrm{comp}^{\mathrm{avg}}(\varepsilon, d; \mathrm{INT})$, of
multivariate integration is bounded by

$$
\frac{g_1(d)}{\varepsilon}\left(\log\frac{1}{\varepsilon}\right)^{(d-1)/2}
\le \mathrm{comp}^{\mathrm{avg}}(\varepsilon, d; \mathrm{INT})
\le \frac{g_2(d)}{\varepsilon}\left(\log\frac{1}{\varepsilon}\right)^{(d-1)/2}.
$$

The average computational complexity,
$\mathrm{comp}^{\mathrm{avg}}(\varepsilon, d; \mathrm{SUR})$, of surface
reconstruction is bounded by

$$
\frac{g_3(d)}{\varepsilon^2}\left(\log\frac{1}{\varepsilon}\right)^{2(d-1)}
\le \mathrm{comp}^{\mathrm{avg}}(\varepsilon, d; \mathrm{SUR})
\le \frac{g_4(d)}{\varepsilon^2}\left(\log\frac{1}{\varepsilon}\right)^{2(d-1)}.
$$

Good estimates of $g_1(d)$, $g_2(d)$, $g_3(d)$ and $g_4(d)$ are
currently not known.


Randomization makes multivariate integration and many other important
problems computationally feasible. It is not, however, a cure-all.
Randomization fails completely for some kinds of problems. For instance,
in 1987 Greg W. Wasilkowski of the University of Kentucky showed that
randomization does not break intractability for surface reconstruction.
Is there an approach that does and that works over a broad class of
mathematics problems?

There is indeed. It is the average-case setting, in which we seek to
break unsolvability and intractability by replacing a worst-case
guarantee with a weaker one: that the expected error is at most
$\varepsilon$. The average-case setting imposes restrictions on the kind
of mathematical inputs. These restrictions are chosen to represent what
would happen most of the time. Technically, the constraints are
described by probability distributions; the distributions describe the
likelihood that certain inputs occur. The most commonly used
distributions are Gaussian measures and, in particular, Wiener measures.

Although it was known since the 1960s that multivariate integration is
tractable on the average, the proof was nonconstructive. That is, it did
not specify the optimal points to evaluate the integrand, the optimal
combinatory algorithm and the average computational complexity. Attempts
to apply ideas from other areas of computation to determine these
unknowns did not work. For example, regularly spaced points, such as
those on a grid, are often used in computation to evaluate the
integrand. But theorems have shown them to be poor choices for the
average-case setting. A proof was given in 1975 by N. Donald Ylvisaker
of the University of California at Los Angeles. It was later generalized
in 1990 by Wasilkowski and Anargyros Papageorgiou, then studying for his
Ph.D. at Columbia University.

The solution came in 1991, when Wozniakowski found the construction. As
sometimes happens in science, a result from number theory, a branch of
mathematics far removed from average-case complexity theory, was
crucial. Part of the key came from work on number theory by Klaus F.
Roth of Imperial College, London, a 1958 Fields Medalist. Another part
was provided by recent work by Wasilkowski.

Let us describe the result more precisely. First, put the smoothness
parameter at zero; that is, tackle a problem that is unsolvable in the
worst-case deterministic setting. Next, assume that integrands are
distributed according to a Wiener measure. If we ignore certain
multiplicative factors for simplicity's sake, the average computational
complexity has been proved to be inversely proportional to the error
threshold (on the order of $1/\varepsilon$) [see the box "Average-Case
Complexity"]. For small errors, the result is a major improvement over
the classical Monte Carlo method, in which the cost is inversely
proportional to the square of the error threshold ($1/\varepsilon^2$).
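As a rough worked example of the gap between the two rates, take an illustrative error threshold of $\varepsilon = 10^{-3}$ (a value chosen here for the arithmetic, not one given in the article) and ignore the multiplicative factors, as the simplified statement does:

```latex
\[
\varepsilon = 10^{-3}: \qquad
\frac{1}{\varepsilon} = 10^{3}, \qquad
\frac{1}{\varepsilon^{2}} = 10^{6}, \qquad
\frac{1/\varepsilon^{2}}{1/\varepsilon} = \frac{1}{\varepsilon} = 10^{3}.
\]
```

Under these simplifying assumptions the saving factor is itself $1/\varepsilon$, so it grows as the error threshold shrinks.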

The average case offers a different kind of assurance from that provided
by the randomized (Monte Carlo) setting. The error in the average-case
setting depends on the distribution of the integrands, whereas the error
in the randomized setting depends on a distribution of the sample
points. In our books-on-a-shelf analogy, the distribution in the
average-case setting might rule out the inclusion of many oversize
books, whereas the distribution in the randomized setting determines
which books are to be sampled.

In the average-case setting the optimal evaluation points must be
deterministically chosen. The best points are Hammersley points or
hyperbolic-cross points [see illustration on pages 4 and 5]. These
deterministic points offer a better sampling than randomly selected or
regularly spaced (or grid) points. They make what would be impossible to
solve tractable on average.
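The article does not spell out how Hammersley points are built. The sketch below uses the textbook construction, in which the first coordinate of point i is i/n and the remaining coordinates are radical inverses of i in successive prime bases; this is an assumption about the variant intended, and the function names are hypothetical. Hyperbolic-cross points are not shown.

```python
def radical_inverse(i: int, base: int) -> float:
    """Mirror the base-`base` digits of i about the radix point."""
    result, scale = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        result += digit * scale
        scale /= base
    return result


def first_primes(k: int) -> list:
    """Return the first k primes by trial division (adequate for small k)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def hammersley(n: int, d: int) -> list:
    """n Hammersley points in [0, 1)^d: (i/n, phi_2(i), phi_3(i), ...)."""
    bases = first_primes(d - 1)
    return [[i / n] + [radical_inverse(i, b) for b in bases] for i in range(n)]


if __name__ == "__main__":
    for point in hammersley(8, 3):
        print(point)
```

Unlike random points, the same n and d always give the same set, and unlike a grid, the number of points does not have to be a d-th power.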

Is surface reconstruction also tractable on the average? This query is
particularly important because, as already mentioned, randomization does
not help. Under the same assumptions we used for integration, we find
that the average computational complexity is on the order of
$1/\varepsilon^2$. Hence, surface reconstruction becomes tractable on
average. As was the case for integration, hyperbolic-cross points are
optimal.

We are now testing whether the average case is a practical alternative.
A Ph.D. student at Columbia, Spassimir H. Paskov, is developing software
to compare the deterministic techniques with Monte Carlo methods for
integration. Preliminary results obtained by testing certain finance
problems suggest the superiority of the deterministic methods in
practice.

Discrete Computational Complexity

This article discusses intractability and breaking of intractability for
multivariate integration and surface reconstruction. These are two
examples of continuous problems. But what is known about the
computational complexity of discrete, rather than continuous, problems?
The famous traveling salesman problem is an example of a discrete
problem, in which the goal is to visit various cities in the shortest
distance possible.

A discrete problem is intractable if its computational complexity
increases exponentially with the number of its inputs. The
intractability of many discrete problems in the worst-case deterministic
setting has been conjectured but not yet proved. What is known is that
hundreds of discrete problems all have essentially the same
computational complexity. That means they are all tractable or all
intractable, and the common belief among experts is that they are all
intractable. For technical reasons, these problems are said to be
NP-complete. One of the great open questions in discrete computational
complexity theory is whether the NP-complete problems are indeed
intractable [see "Turing Machines," by John E. Hopcroft; Scientific
American, May 1984].

REENTRY OF SPACE SHUTTLE provides an example of a computationally
complex task: modeling of the airflow around the craft. This job is
difficult even though only seven variables govern the dynamics. Added
dimensions may yield problems that can never be solved and thus limit
what is scientifically knowable.


In our simplified description, we ignored a multiplicative factor that
affects the computational complexity. This factor depends on the number
of variables in the problem. When the number of variables is large, that
factor can become huge. Good theoretical estimates of the factor are not
known, and obtaining them is believed to be very hard.

Wozniakowski uncovered a solution: get rid of that factor. Specifically,
we say a problem is strongly tractable if the number of function
evaluations needed for the solution is completely independent of the
number of variables. Instead it would depend only on a power of
$1/\varepsilon$. The possibility seems too much to hope for, but it was
proved last year that multivariate integration and surface
reconstruction are both strongly tractable on the average.
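In symbols, the verbal definition of strong tractability given above might be written as follows. The notation is introduced here for illustration only and is not taken from the article: $n(\varepsilon, d)$ stands for the number of function evaluations needed in dimension d, and C and p are constants.

```latex
\[
n(\varepsilon, d) \;\le\; C \left(\frac{1}{\varepsilon}\right)^{p}
\qquad \text{for all } d \ge 1 \text{ and all } \varepsilon \in (0, 1),
\]
where $C$ and $p$ depend on neither $d$ nor $\varepsilon$.
```

The point of the definition is that the bound on the right-hand side contains no factor that grows with d, in contrast to the factors $g_1(d), \ldots, g_4(d)$ in the box "Average-Case Complexity."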

A final obstacle must be overcome before these new results can be used.
We know there must exist evaluation points and a combinatory algorithm
that make integration and surface reconstruction strongly tractable on
the average. Unfortunately, the proof of this result does not tell us
what the points and algorithms are, thus leaving a beautiful challenge
for the future.

Work on information-based complexity has led one of us (Traub) to
speculate that it might be possible to prove formally that certain
scientific questions are unanswerable. The proposed attack is to prove
that the computing resources (time, memory, energy) do not exist in the
universe to answer such questions.

One important achievement of mathematics over the past 60 years is the
idea that mathematical problems may be undecidable, noncomputable or
intractable. Kurt Gödel proved the first of these results. He
established that in a sufficiently rich mathematical system, such as
arithmetic, there are theorems that can never be proved.

We believe it is time to up the ante and try to prove there are
unanswerable scientific questions. In other words, we would like to
establish a physical Gödel's theorem. The process offers a markedly
different challenge from proving results about mathematical problems,
because a scientific question does not come equipped with a mathematical
formulation. Such questions include when the universe will stop
expanding and what the average global temperature will be in the year
2001.

Why do intractability results suggest that some scientific questions
might be unanswerable? Recall the results. In the worst-case
deterministic setting, the computational complexity of many continuous
problems grows exponentially with dimension. Also, the computational
complexity of many discrete problems is conjectured to grow
exponentially with the number of inputs [see the box "Discrete
Computational Complexity"]. Furthermore, although some problems are
tractable in the randomized or average-case settings, it has been proved
that others remain intractable. Such problems may lurk in certain
supercomputing tasks or questions regarding the foundations of physics.
After all, they involve a large number of variables or particles. Even
worse, many physics problems require solutions to a kind of integral
called a path integral, which has an infinite number of dimensions.
Solutions of path integrals invite high-dimensional approximations.
Thus, the intractability results and conjectures are certainly daunting
because they suggest that many tasks with a large number of variables or
objects might be impossible to solve.

We emphasize the possibility of other impediments to answering
scientific questions. One is chaos, the extreme sensitivity to initial
conditions. Because the precise initial conditions are either not known
or cannot be exactly entered into a digital computer, certain questions
about the behavior of a chaotic system cannot be answered. To focus on
the issue at hand, we limit ourselves to intractability.

As we have already observed, a scientific question does not come
equipped with a mathematical formulation. Each of a number of models
might capture the essence of a scientific question. Because
intractability results refer to a particular mathematical formulation,
it might happen that although a particular mathematical formulation is
intractable, another formulation may be found that is indeed tractable.
This prospect indicates a possible way to prove the existence of
unanswerable scientific questions. We can attempt to show that there
exist scientific questions such that every mathematical formulation that
captures the essence of the question is intractable. We would therefore
have science's version of Gödel's theorem.

Humans are intrigued not only by the unknown but also by the unknowable.
Here we have suggested one possible attack to establish what may be
forever unknowable in science. The curse of dimension, broken now for
many kinds of problems, may yet cast its spell.